This directory contains the source code that creates the cleaned video data, which includes:

1) the (log) estimated starting impression and growth rate

2) content-unrelated covariates from the crawled video files, including:

  --hashtag_id
  --video_id
  --log(1st day impression)
  --age of the hashtag when the video gets posted under that hashtag
  --whether the hashtag is still trending
  --ranking of the video under the hashtag
  --if_trending*ranking
  --age of the video
  --if_trending*video_age
  --#hashtags
  --#trending_hashtags
  --whether the video cites the hashtag #fyp 
  --log(#following)
  --log(#follower)
  --log(#videos posted)
  --log(#likes)
  --videoLength (not used in Bayesian estimation)
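
The two interaction covariates in the list above (if_trending*ranking and if_trending*video_age) can be sketched as below. This is only an illustration: the function name and the dictionary keys are hypothetical, and it assumes if_trending is coded as 0/1.

```python
# Minimal sketch; key names are hypothetical and not taken from the source code.
def add_interactions(row):
    """Return a copy of `row` with the two interaction covariates added.

    Assumes `if_trending` is 0 or 1, so each product is zero whenever the
    hashtag is no longer trending.
    """
    out = dict(row)
    out["if_trending_x_ranking"] = row["if_trending"] * row["ranking"]
    out["if_trending_x_video_age"] = row["if_trending"] * row["video_age"]
    return out


example = add_interactions({"if_trending": 1, "ranking": 3, "video_age": 5})
```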

We take the log of the estimated growth parameters, 1st day impression, following count, follower count, likes count and video count because these variables are right-skewed.
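
A minimal sketch of this log transform follows. The field names are hypothetical placeholders. The natural log is an assumption (the base is not stated here), and so is the choice to raise an error on non-positive values, since how zero counts are handled is not described here.

```python
import math

# Hypothetical field names for the right-skewed count variables.
LOG_FIELDS = ["first_day_impression", "following", "follower",
              "videos_posted", "likes"]


def log_transform(row, fields=LOG_FIELDS):
    """Return a copy of `row` with a natural-log column added per field.

    Raises ValueError on non-positive values, because log is undefined
    there; the treatment of zero counts is an assumption of this sketch.
    """
    out = dict(row)
    for name in fields:
        value = row[name]
        if value <= 0:
            raise ValueError(f"cannot take log of {name}={value}")
        out["log_" + name] = math.log(value)
    return out


sample = log_transform({
    "first_day_impression": 1200,
    "following": 1,
    "follower": 50,
    "videos_posted": 8,
    "likes": 300,
})
```

The original fields are kept alongside the new `log_` fields so the raw counts remain available for inspection.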